Image Classification

Hayden Hoopes

In this project, I build several convolutional neural networks (CNNs) and compare how well they perform under different training strategies. Each model classifies a scenery image as one of six categories: buildings, forest, glacier, mountain, sea, or street.

The first CNN is a baseline: a basic model with no regularization or data augmentation, trained from scratch on only the 6,000 images in the training set. It has never seen any other images. I use it as the reference point for checking whether transfer learning results in better model performance.

The second CNN uses transfer learning by feature extraction. It reuses the image features that VGG16 learned during pre-training and builds a new model on top of them, with the goal of beating the baseline.

The third and final CNN uses transfer learning with fine-tuning: some of VGG16's parameters are retrained so that they adapt to the images in this data set.

Data Preprocessing

In this step, I copy the images from their base directories (downloaded from Kaggle) into a new directory structure that works with TensorFlow. I then use the images to create training, validation, and test sets.

In [5]:
import os, shutil, pathlib

categories = ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']

# Copy the first 1000 images of each category into intel/train
# and the next 500 into intel/validation.
for category in categories:
    source_dir = pathlib.Path('seg_train', 'seg_train', category)
    train_dir = pathlib.Path('intel') / 'train' / category
    validation_dir = pathlib.Path('intel') / 'validation' / category
    # exist_ok=True lets the cell be re-run without raising FileExistsError
    os.makedirs(train_dir, exist_ok=True)
    os.makedirs(validation_dir, exist_ok=True)
    files = os.listdir(source_dir)
    for file in files[:1000]:
        shutil.copyfile(src=source_dir / file, dst=train_dir / file)
    for file in files[1000:1500]:
        shutil.copyfile(src=source_dir / file, dst=validation_dir / file)

# Copy up to 1000 images of each category into intel/test.
for category in categories:
    source_dir = pathlib.Path('seg_test', 'seg_test', category)
    test_dir = pathlib.Path('intel') / 'test' / category
    os.makedirs(test_dir, exist_ok=True)
    files = os.listdir(source_dir)
    for file in files[:1000]:
        shutil.copyfile(src=source_dir / file, dst=test_dir / file)
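The split above slices whatever order `os.listdir` returns, and that order is arbitrary. A minimal sketch of a reproducible alternative follows, assuming a hypothetical helper named `split_filenames` (not part of the original notebook) that sorts the names and then applies a seeded shuffle before slicing:

```python
import random

def split_filenames(filenames, n_train, n_val, seed=1):
    """Hypothetical helper: return (train, val) lists that do not depend on listing order."""
    ordered = sorted(filenames)            # remove dependence on os.listdir order
    random.Random(seed).shuffle(ordered)   # seeded shuffle, repeatable across runs
    train = ordered[:n_train]
    val = ordered[n_train:n_train + n_val]
    return train, val

# Illustrative file names only; the real names come from the Kaggle download.
names = [f"{i}.jpg" for i in range(2000)]
train_files, val_files = split_filenames(names, n_train=1000, n_val=500)
print(len(train_files), len(val_files))
```

Because the names are sorted before shuffling, the same seed yields the same split regardless of how the file system lists the directory.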
In [216]:
import tensorflow as tf
train = tf.keras.utils.image_dataset_from_directory(
    'intel/train/',
    labels='inferred',
    label_mode='categorical',
    batch_size=32,
    image_size=(180, 180),
    shuffle=True,
    seed=1
)

test = tf.keras.utils.image_dataset_from_directory(
    'intel/test/',
    labels='inferred',
    label_mode='categorical',
    batch_size=32,
    image_size=(180, 180),
    shuffle=True,
    seed=1
)

# Note: despite its name, `pred` holds the validation split.
pred = tf.keras.utils.image_dataset_from_directory(
    'intel/validation/',
    labels='inferred',
    label_mode='categorical',
    batch_size=32,
    image_size=(180, 180),
    shuffle=True,
    seed=1
)
Found 6000 files belonging to 6 classes.
Found 3000 files belonging to 6 classes.
Found 3000 files belonging to 6 classes.
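As a quick sanity check on these counts, the number of batches per epoch follows from the dataset size and `batch_size=32`. The sketch below assumes the final partial batch is kept rather than dropped, which matches the `188/188` and `94/94` step counts in the training and evaluation logs; `batches_per_epoch` is a hypothetical helper name.

```python
import math

def batches_per_epoch(n_images, batch_size=32):
    """Number of batches when the last, smaller batch is kept."""
    return math.ceil(n_images / batch_size)

train_batches = batches_per_epoch(6000)  # 6 classes x 1000 training images
other_batches = batches_per_epoch(3000)  # size of the validation and test sets
print(train_batches, other_batches)      # 188 94
```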

Baseline Model

The model below is a basic CNN model trained on only the 6000 images provided to it in the training set. This model will serve as the benchmark for determining how well the other models perform.

As seen below, the baseline model reached an accuracy of about 74% on the test set. Random guessing would have given only about 17% (1 in 6), assuming all of the classes were evenly distributed, so this initial model is off to a good start.
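The chance-level figure quoted above can be checked with a short calculation. The sketch below compares two naive baselines: guessing uniformly at random and always predicting the most common class. The per-class counts are a hypothetical balanced example, not the measured class distribution of the test set, and both function names are illustrative.

```python
def uniform_guess_accuracy(n_classes):
    """Expected accuracy when each class is guessed with equal probability."""
    return 1.0 / n_classes

def majority_class_accuracy(class_counts):
    """Accuracy of always predicting the most frequent class."""
    return max(class_counts) / sum(class_counts)

counts = [500] * 6  # hypothetical: six perfectly balanced classes
uniform = uniform_guess_accuracy(len(counts))
majority = majority_class_accuracy(counts)
print(round(uniform, 3), round(majority, 3))  # 0.167 0.167
```

With balanced classes the two baselines coincide at 1/6. If the classes were imbalanced, the majority-class baseline would rise above 1/6 and would be the stricter bar to beat.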

In [217]:
from tensorflow import keras
from tensorflow.keras import layers

def build_model():
    inputs = keras.Input(shape=(180, 180, 3), name='Input')
    x = layers.Rescaling(1./255, name='rescaling')(inputs)
    x = layers.Conv2D(filters=32, kernel_size=3, activation='relu', name='convolution_layer_1')(x)
    x = layers.MaxPooling2D(pool_size=2, name='pooling_1')(x)
    x = layers.Conv2D(filters=64, kernel_size=3, activation='relu', name='convolution_layer_2')(x)
    x = layers.MaxPooling2D(pool_size=2, name='pooling_2')(x)
    x = layers.Conv2D(filters=128, kernel_size=3, activation='relu', name='convolution_layer_3')(x)
    x = layers.MaxPooling2D(pool_size=2, name='pooling_3')(x)
    x = layers.Conv2D(filters=256, kernel_size=3, activation='relu', name='convolution_layer_4')(x)
    x = layers.MaxPooling2D(pool_size=2, name='pooling_4')(x)
    x = layers.Conv2D(filters=256, kernel_size=3, activation='relu', name='convolution_layer_5')(x)
    x = layers.Flatten()(x)
    
    outputs = layers.Dense(6, activation='softmax', name='output')(x)
    
    model = keras.Model(inputs=inputs, outputs=outputs, name='base_cnn')
    
    model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
    return model
In [218]:
model = build_model()
model.summary()
Model: "base_cnn"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 Input (InputLayer)          [(None, 180, 180, 3)]     0         
                                                                 
 rescaling (Rescaling)       (None, 180, 180, 3)       0         
                                                                 
 convolution_layer_1 (Conv2D  (None, 178, 178, 32)     896       
 )                                                               
                                                                 
 pooling_1 (MaxPooling2D)    (None, 89, 89, 32)        0         
                                                                 
 convolution_layer_2 (Conv2D  (None, 87, 87, 64)       18496     
 )                                                               
                                                                 
 pooling_2 (MaxPooling2D)    (None, 43, 43, 64)        0         
                                                                 
 convolution_layer_3 (Conv2D  (None, 41, 41, 128)      73856     
 )                                                               
                                                                 
 pooling_3 (MaxPooling2D)    (None, 20, 20, 128)       0         
                                                                 
 convolution_layer_4 (Conv2D  (None, 18, 18, 256)      295168    
 )                                                               
                                                                 
 pooling_4 (MaxPooling2D)    (None, 9, 9, 256)         0         
                                                                 
 convolution_layer_5 (Conv2D  (None, 7, 7, 256)        590080    
 )                                                               
                                                                 
 flatten_8 (Flatten)         (None, 12544)             0         
                                                                 
 output (Dense)              (None, 6)                 75270     
                                                                 
=================================================================
Total params: 1,053,766
Trainable params: 1,053,766
Non-trainable params: 0
_________________________________________________________________
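The shapes and parameter counts in this summary can be reproduced by hand. The sketch below assumes what the model definition specifies: 3x3 kernels with the default `'valid'` padding (each convolution trims 2 pixels from height and width), 2x2 max pooling that floors odd sizes, and one bias per filter or output unit. The helper names are illustrative.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units

size, channels, total = 180, 3, 0
filters = [32, 64, 128, 256, 256]
for i, n_filters in enumerate(filters):
    total += conv2d_params(3, channels, n_filters)
    size -= 2                  # 3x3 convolution with 'valid' padding
    channels = n_filters
    if i < len(filters) - 1:   # the last convolution is not followed by pooling
        size //= 2             # 2x2 max pooling, odd sizes rounded down

flat = size * size * channels
total += dense_params(flat, 6)
print(size, flat, total)  # 7 12544 1053766
```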
In [219]:
from tensorflow.keras.callbacks import ModelCheckpoint
modelcheckpoint = ModelCheckpoint(filepath='base_cnn_checkpoint.keras', save_best_only=True, monitor='val_loss')
In [221]:
history = model.fit(train, epochs=16, validation_data=pred, callbacks=[modelcheckpoint])
Epoch 1/16
188/188 [==============================] - 148s 790ms/step - loss: 1.1663 - accuracy: 0.5353 - val_loss: 0.9468 - val_accuracy: 0.6053
Epoch 2/16
188/188 [==============================] - 144s 766ms/step - loss: 0.9439 - accuracy: 0.6403 - val_loss: 0.8151 - val_accuracy: 0.6953
Epoch 3/16
188/188 [==============================] - 146s 774ms/step - loss: 0.7880 - accuracy: 0.7075 - val_loss: 0.7144 - val_accuracy: 0.7400
Epoch 4/16
188/188 [==============================] - 145s 772ms/step - loss: 0.6719 - accuracy: 0.7593 - val_loss: 1.0063 - val_accuracy: 0.6703
Epoch 5/16
188/188 [==============================] - 149s 791ms/step - loss: 0.5895 - accuracy: 0.7910 - val_loss: 0.8020 - val_accuracy: 0.7443
Epoch 6/16
188/188 [==============================] - 150s 797ms/step - loss: 0.5070 - accuracy: 0.8223 - val_loss: 0.8320 - val_accuracy: 0.7240
Epoch 7/16
188/188 [==============================] - 157s 837ms/step - loss: 0.4428 - accuracy: 0.8435 - val_loss: 0.8761 - val_accuracy: 0.7493
Epoch 8/16
188/188 [==============================] - 205s 1s/step - loss: 0.3652 - accuracy: 0.8757 - val_loss: 0.8408 - val_accuracy: 0.7730
Epoch 9/16
188/188 [==============================] - 251s 1s/step - loss: 0.3092 - accuracy: 0.8938 - val_loss: 0.8912 - val_accuracy: 0.7560
Epoch 10/16
188/188 [==============================] - 287s 2s/step - loss: 0.2566 - accuracy: 0.9120 - val_loss: 0.9330 - val_accuracy: 0.7680
Epoch 11/16
188/188 [==============================] - 293s 2s/step - loss: 0.2182 - accuracy: 0.9272 - val_loss: 1.1452 - val_accuracy: 0.7530
Epoch 12/16
188/188 [==============================] - 300s 2s/step - loss: 0.1989 - accuracy: 0.9352 - val_loss: 1.2648 - val_accuracy: 0.7610
Epoch 13/16
188/188 [==============================] - 296s 2s/step - loss: 0.1594 - accuracy: 0.9460 - val_loss: 1.9313 - val_accuracy: 0.7380
Epoch 14/16
188/188 [==============================] - 296s 2s/step - loss: 0.1756 - accuracy: 0.9498 - val_loss: 3.3373 - val_accuracy: 0.6747
Epoch 15/16
188/188 [==============================] - 294s 2s/step - loss: 0.1465 - accuracy: 0.9600 - val_loss: 2.3486 - val_accuracy: 0.7170
Epoch 16/16
188/188 [==============================] - 293s 2s/step - loss: 0.1467 - accuracy: 0.9593 - val_loss: 1.8843 - val_accuracy: 0.7790
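Because the checkpoint callback uses `save_best_only=True` with `monitor='val_loss'`, the saved model is the one from the epoch with the lowest validation loss, not the last epoch. The sketch below copies the validation values from the log above and locates that epoch. It assumes the callback started this run without a previously recorded best value.

```python
# Validation metrics transcribed from the 16-epoch training log above.
val_loss = [0.9468, 0.8151, 0.7144, 1.0063, 0.8020, 0.8320, 0.8761, 0.8408,
            0.8912, 0.9330, 1.1452, 1.2648, 1.9313, 3.3373, 2.3486, 1.8843]
val_accuracy = [0.6053, 0.6953, 0.7400, 0.6703, 0.7443, 0.7240, 0.7493, 0.7730,
                0.7560, 0.7680, 0.7530, 0.7610, 0.7380, 0.6747, 0.7170, 0.7790]

# Epochs are 1-based in the log, list indices are 0-based.
best_loss_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_acc_epoch = max(range(len(val_accuracy)), key=val_accuracy.__getitem__) + 1

print(best_loss_epoch, val_loss[best_loss_epoch - 1])      # 3 0.7144
print(best_acc_epoch, val_accuracy[best_acc_epoch - 1])    # 16 0.779
```

The lowest validation loss occurs at epoch 3 while the highest validation accuracy occurs at epoch 16, so the choice of monitored metric changes which weights are kept. Training loss keeps falling after epoch 3 while validation loss climbs, which is the usual signature of overfitting.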
In [222]:
import pandas as pd
import matplotlib.pyplot as plt

pd.DataFrame(history.history)[['accuracy', 'val_accuracy']].plot()
plt.show()
In [223]:
test_model = keras.models.load_model('base_cnn_checkpoint.keras')
test_model.evaluate(test)
94/94 [==============================] - 21s 222ms/step - loss: 0.6973 - accuracy: 0.7430
Out[223]:
[0.6973123550415039, 0.7429999709129333]

Transfer Learning - Feature Extraction

In this step, I will use the pre-trained model VGG16 to extract image features that may help my CNN identify shapes a little more easily. I would expect this model to perform even better than the previous baseline model did.

First, I import the VGG16 model without its classification layers (`include_top=False`). Then, I run every batch of images through VGG16 with `predict` and store the resulting feature maps together with their labels. That way, the inputs to the model that I build later have already passed through VGG16's pre-trained convolutional layers, so the general-purpose visual features have already been extracted. My model only needs to learn how those features map to the six classes that are specific to this project.

As seen below, the model's validation accuracy peaked around epoch 2 or 3, indicating that the model may have started overfitting after that point. However, the best model still had an accuracy of about 90.7% when applied to the test set, which is better than the baseline model's test accuracy of about 74%.

In [224]:
vgg = keras.applications.vgg16.VGG16(
    weights='imagenet',
    include_top=False,
    input_shape=(180, 180, 3))

vgg.summary()
Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_6 (InputLayer)        [(None, 180, 180, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 180, 180, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 180, 180, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 90, 90, 64)        0         
                                                                 
 block2_conv1 (Conv2D)       (None, 90, 90, 128)       73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 90, 90, 128)       147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 45, 45, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 45, 45, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 45, 45, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 45, 45, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 22, 22, 256)       0         
                                                                 
 block4_conv1 (Conv2D)       (None, 22, 22, 512)       1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 22, 22, 512)       2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 22, 22, 512)       2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 11, 11, 512)       0         
                                                                 
 block5_conv1 (Conv2D)       (None, 11, 11, 512)       2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 11, 11, 512)       2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 11, 11, 512)       2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 5, 5, 512)         0         
                                                                 
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
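Unlike the baseline, which rescales pixels to the range 0 to 1, images sent to VGG16 go through `keras.applications.vgg16.preprocess_input`. The NumPy sketch below illustrates what that step is understood to do for VGG16: reorder the channels from RGB to BGR and subtract the ImageNet per-channel means, with no rescaling. The function `vgg16_style_preprocess` is a hypothetical stand-in written for illustration, not the Keras implementation, and it assumes channels-last input with pixel values from 0 to 255.

```python
import numpy as np

# ImageNet channel means in BGR order (assumed to match Keras' "caffe"-style mode).
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def vgg16_style_preprocess(images_rgb):
    """Hypothetical re-implementation for channels-last RGB images in [0, 255]."""
    images = np.asarray(images_rgb, dtype=np.float32)
    bgr = images[..., ::-1]            # reorder channels: RGB -> BGR
    return bgr - IMAGENET_BGR_MEANS    # zero-center each channel, no scaling

# A batch of two 180x180 images in which every pixel is pure red.
batch = np.zeros((2, 180, 180, 3), dtype=np.float32)
batch[..., 0] = 255.0
out = vgg16_style_preprocess(batch)
print(out.shape)       # (2, 180, 180, 3)
print(out[0, 0, 0])    # blue, green, red after mean subtraction
```

The practical consequence is that raw 0 to 255 images should be passed to this step. Adding a `Rescaling(1./255)` layer in front of it would feed VGG16 inputs on a scale it was not trained on.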
In [225]:
import numpy as np

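def get_features_and_labels(data):
    """Run every batch through the VGG16 base and return features, labels, and raw images as arrays."""
    all_features = []
    all_labels = []
    raw_images = []
    
    for images, labels in data:
        raw_images.append(images)
        # Apply VGG16's own input preprocessing before extracting features
        preprocessed_images = keras.applications.vgg16.preprocess_input(images)
        # predict() prints one progress line per batch
        features = vgg.predict(preprocessed_images)
        all_features.append(features)
        all_labels.append(labels)
    return np.concatenate(all_features), np.concatenate(all_labels), np.concatenate(raw_images)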

train_features, train_labels, raw_train_images = get_features_and_labels(train)
test_features, test_labels, raw_test_images = get_features_and_labels(test)
validation_features, validation_labels, raw_validation_images = get_features_and_labels(pred)
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 1s 1s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 3s 3s/step
In [226]:
inputs = keras.Input(shape=(5, 5, 512))
x = layers.Flatten()(inputs)
x = layers.Dense(128)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(6, activation="softmax")(x)

model = keras.Model(inputs, outputs)
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy']
             )

model.summary()
Model: "model_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_7 (InputLayer)        [(None, 5, 5, 512)]       0         
                                                                 
 flatten_9 (Flatten)         (None, 12800)             0         
                                                                 
 dense_8 (Dense)             (None, 128)               1638528   
                                                                 
 dropout_4 (Dropout)         (None, 128)               0         
                                                                 
 dense_9 (Dense)             (None, 6)                 774       
                                                                 
=================================================================
Total params: 1,639,302
Trainable params: 1,639,302
Non-trainable params: 0
_________________________________________________________________
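The parameter counts in the summary above can be checked by hand: a Dense layer has one weight per (input, unit) pair plus one bias per unit. Below is a small sketch of that arithmetic; `dense_params` is a helper name introduced here for illustration and is not part of the notebook.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Number of weights plus biases in a fully connected layer."""
    return n_inputs * n_units + n_units


# Flatten turns the (5, 5, 512) VGG16 feature map into one long vector
flattened = 5 * 5 * 512

hidden = dense_params(flattened, 128)  # Dense(128) on the flattened features
output = dense_params(128, 6)          # Dense(6) softmax over the six classes

print(flattened)        # 12800
print(hidden)           # 1638528
print(output)           # 774
print(hidden + output)  # 1639302
```

Almost all of the parameters sit in the first Dense layer, because the flattened feature map is so wide.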
In [227]:
modelcheckpoint = ModelCheckpoint(filepath='cnn_checkpoint_feature_extraction.keras', save_best_only=True, monitor='val_loss')

history = model.fit(train_features, train_labels, epochs=16, validation_data=(validation_features, validation_labels), callbacks=[modelcheckpoint])
Epoch 1/16
188/188 [==============================] - 7s 34ms/step - loss: 10.8321 - accuracy: 0.8445 - val_loss: 7.8652 - val_accuracy: 0.8833
Epoch 2/16
188/188 [==============================] - 6s 31ms/step - loss: 4.5351 - accuracy: 0.9197 - val_loss: 7.0565 - val_accuracy: 0.9023
Epoch 3/16
188/188 [==============================] - 6s 31ms/step - loss: 3.4414 - accuracy: 0.9312 - val_loss: 6.6532 - val_accuracy: 0.9040
Epoch 4/16
188/188 [==============================] - 5s 25ms/step - loss: 2.0043 - accuracy: 0.9482 - val_loss: 7.2783 - val_accuracy: 0.8860
Epoch 5/16
188/188 [==============================] - 4s 23ms/step - loss: 1.6755 - accuracy: 0.9533 - val_loss: 6.4782 - val_accuracy: 0.9030
Epoch 6/16
188/188 [==============================] - 7s 36ms/step - loss: 1.0774 - accuracy: 0.9618 - val_loss: 7.8521 - val_accuracy: 0.8950
Epoch 7/16
188/188 [==============================] - 7s 35ms/step - loss: 0.7761 - accuracy: 0.9700 - val_loss: 7.0253 - val_accuracy: 0.9013
Epoch 8/16
188/188 [==============================] - 5s 25ms/step - loss: 0.8314 - accuracy: 0.9713 - val_loss: 8.1202 - val_accuracy: 0.8867
Epoch 9/16
188/188 [==============================] - 7s 36ms/step - loss: 0.8124 - accuracy: 0.9707 - val_loss: 7.6576 - val_accuracy: 0.9017
Epoch 10/16
188/188 [==============================] - 7s 36ms/step - loss: 0.5683 - accuracy: 0.9743 - val_loss: 7.8524 - val_accuracy: 0.9033
Epoch 11/16
188/188 [==============================] - 7s 36ms/step - loss: 0.5871 - accuracy: 0.9785 - val_loss: 7.4454 - val_accuracy: 0.9013
Epoch 12/16
188/188 [==============================] - 7s 36ms/step - loss: 0.3817 - accuracy: 0.9823 - val_loss: 8.5641 - val_accuracy: 0.9067
Epoch 13/16
188/188 [==============================] - 7s 37ms/step - loss: 0.4446 - accuracy: 0.9832 - val_loss: 7.9446 - val_accuracy: 0.9053
Epoch 14/16
188/188 [==============================] - 5s 29ms/step - loss: 0.3915 - accuracy: 0.9828 - val_loss: 9.0475 - val_accuracy: 0.8967
Epoch 15/16
188/188 [==============================] - 4s 24ms/step - loss: 0.4196 - accuracy: 0.9852 - val_loss: 7.9861 - val_accuracy: 0.9043
Epoch 16/16
188/188 [==============================] - 4s 24ms/step - loss: 0.3301 - accuracy: 0.9857 - val_loss: 9.0022 - val_accuracy: 0.9033
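Because the checkpoint uses `save_best_only=True` with `monitor='val_loss'`, the saved file should correspond to the epoch with the lowest validation loss, not the last epoch. The sketch below finds that epoch from values copied by hand out of the log above; the list names are my own and nothing here touches Keras.

```python
# Validation metrics per epoch, copied from the training log above
val_loss = [7.8652, 7.0565, 6.6532, 7.2783, 6.4782, 7.8521, 7.0253, 8.1202,
            7.6576, 7.8524, 7.4454, 8.5641, 7.9446, 9.0475, 7.9861, 9.0022]
val_accuracy = [0.8833, 0.9023, 0.9040, 0.8860, 0.9030, 0.8950, 0.9013, 0.8867,
                0.9017, 0.9033, 0.9013, 0.9067, 0.9053, 0.8967, 0.9043, 0.9033]

# Epochs are numbered from 1 in the log, list indices from 0
best_loss_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_acc_epoch = max(range(len(val_accuracy)), key=val_accuracy.__getitem__) + 1

print("lowest val_loss at epoch", best_loss_epoch)
print("highest val_accuracy at epoch", best_acc_epoch)
```

The two epochs differ, so monitoring `val_accuracy` instead would presumably have saved a different set of weights.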
In [228]:
pd.DataFrame(history.history)[['accuracy', 'val_accuracy']].plot()
plt.show()
In [229]:
pd.DataFrame(history.history)[['loss', 'val_loss']].plot()
plt.show()
In [230]:
test_model = keras.models.load_model('cnn_checkpoint_feature_extraction.keras')
test_model.evaluate(test_features, test_labels)
94/94 [==============================] - 1s 9ms/step - loss: 6.6769 - accuracy: 0.9023
Out[230]:
[6.676905155181885, 0.9023333191871643]

Transfer Learning - Fine Tuning¶

In this step, I will use the same pre-trained model as before (VGG16) but apply fine tuning to better train the model to work with my data set.

Fine-tuning was slow: the 16 epochs in the log below took roughly 2 hours 40 minutes in total (about 10 minutes per epoch). Even after all of this training, the checkpointed model reached only 89.1% accuracy on the test set, which is slightly worse than the 90.2% achieved by the previous model that used feature extraction.

In [231]:
# This sets only the last four layers to be trainable through backpropagation
vgg.trainable = True
for layer in vgg.layers[:-4]:
    layer.trainable = False
In [232]:
inputs = keras.Input(shape=(180, 180, 3))
x = keras.applications.vgg16.preprocess_input(inputs)
x = vgg(x)
x = layers.Flatten()(x)
x = layers.Dense(128)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(6, activation='softmax')(x)

model = keras.Model(inputs, outputs)
model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.RMSprop(learning_rate=1e-5),
              metrics=['accuracy'])

model.summary()
Model: "model_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_8 (InputLayer)        [(None, 180, 180, 3)]     0         
                                                                 
 tf.__operators__.getitem_2   (None, 180, 180, 3)      0         
 (SlicingOpLambda)                                               
                                                                 
 tf.nn.bias_add_2 (TFOpLambd  (None, 180, 180, 3)      0         
 a)                                                              
                                                                 
 vgg16 (Functional)          (None, 5, 5, 512)         14714688  
                                                                 
 flatten_10 (Flatten)        (None, 12800)             0         
                                                                 
 dense_10 (Dense)            (None, 128)               1638528   
                                                                 
 dropout_5 (Dropout)         (None, 128)               0         
                                                                 
 dense_11 (Dense)            (None, 6)                 774       
                                                                 
=================================================================
Total params: 16,353,990
Trainable params: 8,718,726
Non-trainable params: 7,635,264
_________________________________________________________________
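The trainable and non-trainable counts in the summary above follow from which layers the `[:-4]` slice leaves unfrozen. The sketch below reproduces them without loading Keras. The layer names are typed by hand and are an assumption about what `vgg.layers` contains for VGG16 without its classification head; `conv3x3_params` is an illustrative helper.

```python
# Assumed layer order of VGG16 with include_top=False (19 layers)
vgg_layers = [
    "input",
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
]

# Same slicing as the notebook cell: everything except the last four is frozen
frozen = vgg_layers[:-4]
unfrozen = vgg_layers[-4:]


def conv3x3_params(in_channels: int, out_channels: int) -> int:
    """Weights plus biases of a 3x3 convolution."""
    return 3 * 3 * in_channels * out_channels + out_channels


# Three 512-to-512 convolutions are unfrozen; the pooling layer has no weights
block5 = 3 * conv3x3_params(512, 512)
head = (12800 * 128 + 128) + (128 * 6 + 6)  # Dense(128) + Dense(6)

vgg_total = 14714688  # from the model summary above
print(unfrozen)
print("trainable:", block5 + head)
print("non-trainable:", vgg_total - block5)
```

So "the last four layers" amounts to the three block 5 convolutions plus a pooling layer with nothing to train.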
In [233]:
modelcheckpoint = ModelCheckpoint(filepath='cnn_checkpoint_fine_tuning.keras', save_best_only=True, monitor='val_loss')

history = model.fit(train, epochs=16, validation_data=pred, callbacks=[modelcheckpoint])
Epoch 1/16
188/188 [==============================] - 849s 5s/step - loss: 3.9684 - accuracy: 0.5003 - val_loss: 0.8098 - val_accuracy: 0.7377
Epoch 2/16
188/188 [==============================] - 571s 3s/step - loss: 0.9318 - accuracy: 0.7310 - val_loss: 0.5447 - val_accuracy: 0.8410
Epoch 3/16
188/188 [==============================] - 626s 3s/step - loss: 0.5518 - accuracy: 0.8345 - val_loss: 0.4677 - val_accuracy: 0.8683
Epoch 4/16
188/188 [==============================] - 586s 3s/step - loss: 0.3934 - accuracy: 0.8820 - val_loss: 0.4222 - val_accuracy: 0.8803
Epoch 5/16
188/188 [==============================] - 595s 3s/step - loss: 0.2631 - accuracy: 0.9170 - val_loss: 0.4192 - val_accuracy: 0.8867
Epoch 6/16
188/188 [==============================] - 589s 3s/step - loss: 0.2165 - accuracy: 0.9340 - val_loss: 0.4001 - val_accuracy: 0.8970
Epoch 7/16
188/188 [==============================] - 589s 3s/step - loss: 0.1621 - accuracy: 0.9495 - val_loss: 0.4164 - val_accuracy: 0.8997
Epoch 8/16
188/188 [==============================] - 585s 3s/step - loss: 0.1091 - accuracy: 0.9633 - val_loss: 0.4263 - val_accuracy: 0.9030
Epoch 9/16
188/188 [==============================] - 583s 3s/step - loss: 0.0925 - accuracy: 0.9717 - val_loss: 0.4378 - val_accuracy: 0.9037
Epoch 10/16
188/188 [==============================] - 582s 3s/step - loss: 0.0672 - accuracy: 0.9792 - val_loss: 0.4504 - val_accuracy: 0.9083
Epoch 11/16
188/188 [==============================] - 586s 3s/step - loss: 0.0454 - accuracy: 0.9865 - val_loss: 0.4683 - val_accuracy: 0.9070
Epoch 12/16
188/188 [==============================] - 587s 3s/step - loss: 0.0351 - accuracy: 0.9888 - val_loss: 0.4992 - val_accuracy: 0.9087
Epoch 13/16
188/188 [==============================] - 589s 3s/step - loss: 0.0272 - accuracy: 0.9917 - val_loss: 0.5269 - val_accuracy: 0.9083
Epoch 14/16
188/188 [==============================] - 572s 3s/step - loss: 0.0220 - accuracy: 0.9927 - val_loss: 0.5506 - val_accuracy: 0.9083
Epoch 15/16
188/188 [==============================] - 587s 3s/step - loss: 0.0169 - accuracy: 0.9952 - val_loss: 0.5798 - val_accuracy: 0.9117
Epoch 16/16
188/188 [==============================] - 578s 3s/step - loss: 0.0146 - accuracy: 0.9958 - val_loss: 0.6038 - val_accuracy: 0.9067
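To put the training cost in numbers, the per-epoch durations and validation losses can be read off the log above. This is only a sketch on values copied by hand; the variable names are my own.

```python
# Seconds per epoch and val_loss per epoch, copied from the log above
epoch_seconds = [849, 571, 626, 586, 595, 589, 589, 585,
                 583, 582, 586, 587, 589, 572, 587, 578]
val_loss = [0.8098, 0.5447, 0.4677, 0.4222, 0.4192, 0.4001, 0.4164, 0.4263,
            0.4378, 0.4504, 0.4683, 0.4992, 0.5269, 0.5506, 0.5798, 0.6038]

total_seconds = sum(epoch_seconds)
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
seconds_to_best = sum(epoch_seconds[:best_epoch])

print(f"total training time: {total_seconds / 3600:.2f} hours")
print(f"lowest val_loss at epoch {best_epoch}")
print(f"time spent up to that epoch: {seconds_to_best / 60:.1f} minutes")
```

The validation loss stops improving well before the last epoch while the training loss keeps falling, which suggests the later epochs mostly added overfitting rather than useful training.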
In [234]:
pd.DataFrame(history.history)[['accuracy', 'val_accuracy']].plot()
plt.show()
In [235]:
pd.DataFrame(history.history)[['loss', 'val_loss']].plot()
plt.show()
In [236]:
test_model = keras.models.load_model('cnn_checkpoint_fine_tuning.keras')
test_model.evaluate(test)
94/94 [==============================] - 164s 2s/step - loss: 0.4373 - accuracy: 0.8913
Out[236]:
[0.4373050630092621, 0.8913333415985107]

Conclusion¶

In this project, I compared three models to determine which one gave the best classification accuracy. The best model was the CNN that used VGG16 for feature extraction, which reached about 90.2% accuracy on the test set. The CNN that fine-tuned VGG16 reached 89.1%, and the baseline CNN reached 78.2%.

The classification report below shows that the best model is extremely good at predicting classes 1 and 4 (forest and sea) but weaker on classes 2 and 3 (glacier and mountain).

In [248]:
best_model = keras.models.load_model('cnn_checkpoint_feature_extraction.keras')
In [253]:
predictions = best_model.predict(test_features)
94/94 [==============================] - 0s 5ms/step
In [264]:
# Only take the argmax while the arrays are still 2-D, so re-running this cell is safe
test_labels = np.argmax(test_labels, axis=1) if test_labels.ndim > 1 else test_labels
predictions = np.argmax(predictions, axis=1) if predictions.ndim > 1 else predictions
In [267]:
from sklearn.metrics import classification_report, confusion_matrix
print(classification_report(test_labels, predictions))
              precision    recall  f1-score   support

           0       0.94      0.86      0.90       437
           1       0.99      0.98      0.99       474
           2       0.82      0.86      0.84       553
           3       0.85      0.81      0.83       525
           4       0.94      0.96      0.95       510
           5       0.89      0.95      0.92       501

    accuracy                           0.90      3000
   macro avg       0.91      0.90      0.90      3000
weighted avg       0.90      0.90      0.90      3000
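The f1-score column is the harmonic mean of precision and recall, which is why the two weakest classes score lower than either of their individual columns might suggest. Below is a minimal sketch that recomputes it for classes 2 and 3 from the rounded values in the report above; the function and dictionary names are illustrative.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) copied from the classification report above
report = {
    "glacier (2)": (0.82, 0.86),
    "mountain (3)": (0.85, 0.81),
}

scores = {name: round(f1(p, r), 2) for name, (p, r) in report.items()}
print(scores)
```

Because the inputs are already rounded to two decimals, recomputing other rows this way can differ from the report in the last digit.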

In the heatmap below, we can see that the model often confuses classes 2 and 3 (glacier and mountain) with each other, as well as classes 0 and 5 (buildings and street).

In [269]:
import seaborn as sns

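# Rows of the confusion matrix are the actual classes, columns are the predicted classes
sns.heatmap(confusion_matrix(test_labels, predictions), annot=True, fmt='d', cmap='Reds')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Prediction and Actuals')
plt.show()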

Below, we can visualize some of the images that the model misclassified.

In [278]:
# categories = ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
misclass = test_labels != predictions

misclassified_images = raw_test_images[misclass]
actual_labels = test_labels[misclass]
predicted_labels = predictions[misclass]

plt.figure(figsize=(10, 10))
for i, n in enumerate(misclassified_images[:9]):
    ax = plt.subplot(3, 3, i+1)
    plt.imshow(n.astype('uint8'))
    
print('predictions', predicted_labels[:9])
print('actuals    ', actual_labels[:9])
plt.show()
predictions [4 0 2 5 2 5 2 3 3]
actuals     [2 5 3 0 1 0 3 2 4]

The images above were all misclassified by the best performing model. Most of these images seem fairly easy for me, a human being, to classify. For example, the bottom left image is clearly a mountain, but the model classified it as a glacier. Other images, such as the top left one, are difficult even for me to classify since the shapes and colors are unclear.
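The printed class indices are easier to read once they are mapped back to category names. The sketch below does that for the nine images shown, using the `categories` list from the preprocessing step and the two printed arrays copied by hand. It assumes the class indices follow the order of that list, which matches the class descriptions used in this section.

```python
categories = ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']

# Copied from the printed output above
predicted = [4, 0, 2, 5, 2, 5, 2, 3, 3]
actual = [2, 5, 3, 0, 1, 0, 3, 2, 4]

# One (predicted name, actual name) pair per image, in row-major grid order
pairs = [(categories[p], categories[a]) for p, a in zip(predicted, actual)]

for position, (pred_name, actual_name) in enumerate(pairs, start=1):
    print(f"image {position}: predicted {pred_name}, actual {actual_name}")
```

Image 7 is the bottom left of the 3x3 grid, the mountain that was classified as a glacier.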

Overall, the models were able to classify the images with fairly decent accuracy. In the future, I may try combining feature extraction and fine tuning to create an even better CNN, one that makes fuller use of what VGG16 has already learned while still adapting to my specific data set.